Population size, HIV prevalence, and antiretroviral therapy coverage among key populations in sub-Saharan Africa: collation and synthesis of survey data, 2010–23

Summary
Background Key population HIV programmes in sub-Saharan Africa require epidemiological information to ensure equitable and universal access to effective services. We aimed to consolidate and harmonise survey data among female sex workers, men who have sex with men, people who inject drugs, and transgender people to estimate key population size, HIV prevalence, and antiretroviral therapy (ART) coverage for countries in mainland sub-Saharan Africa.
Methods Key population size estimates, HIV prevalence, and ART coverage data from 39 sub-Saharan Africa countries between 2010 and 2023 were collated from existing databases and verified against source documents. We used Bayesian mixed-effects spatial regression to model urban key population size estimates as a proportion of the gender-matched, year-matched, and area-matched population aged 15–49 years. We modelled subnational key population HIV prevalence and ART coverage with age-matched, gender-matched, year-matched, and province-matched total population estimates as predictors.
Findings We extracted 2065 key population size data points, 1183 HIV prevalence data points, and 259 ART coverage data points. Across national urban populations, a median of 1·65% (IQR 1·35–1·91) of adult cisgender women were female sex workers, 0·89% (0·77–0·95) were men who have sex with men, 0·32% (0·31–0·34) were men who injected drugs, and 0·10% (0·06–0·12) were women who were transgender. HIV prevalence among key populations was, on average, four to six times higher than matched total population prevalence, and ART coverage was correlated with, but lower than, the total population ART coverage with wide heterogeneity in relative ART coverage across studies. Across sub-Saharan Africa, key populations were estimated as comprising 1·2% (95% credible interval 0·9–1·6) of the total population aged 15–49 years but 6·1% (4·5–8·2) of people living with HIV.
Interpretation Key populations in sub-Saharan Africa experience higher HIV prevalence and lower ART coverage, underscoring the need for focused prevention and treatment services. In 2024, limited data availability and heterogeneity constrain precise estimates for programming and monitoring trends. Strengthening key population surveys and routine data within national HIV strategic information systems would support more precise estimates.
Funding UNAIDS, Bill & Melinda Gates Foundation, and US National Institutes of Health.


Supplementary Text S1: Study population definitions and inclusion criteria based on risk behaviours and time periods
Inclusion criteria were extracted from 96 of 247 studies. Of these, 79% (76/96) reported an inclusion criterion based on engaging in risk behaviour within a specific recent time period, and 88% (84/96) reported specific risk behaviours that defined inclusion in the population group.
Among FSW, 68% (26/38) used a broad definition of sex work including exchanging sex for money, goods, or favours; 21% (8/38) required women to be commercial sex workers or to exchange sex for money only; and the remaining 4 surveys recruited women who self-identified as FSW or attended FSW clinics. Among MSM, all surveys required men to have had either anal sex (20%; 6/30) or oral or anal sex (80%; 24/30).
Among PWID, 53% (8/15) of surveys required injection of any drug, and a further 27% (4/15) specified injecting drug use of heroin, cocaine, or methamphetamine. The remaining surveys recruited people who used and injected drugs, from which this analysis extracted data on those who injected drugs.

Supplementary Text S2: Key population size estimate method classifications
The five classifications for key population size estimate (KPSE) methods were:
• 2S-CRC: two-source capture-recapture methods, including object, service, and event multiplier methods.
• PLACE/mapping: estimates using the Priorities for Local AIDS Control Efforts (PLACE) methodology and other programmatic or hotspot mapping-derived estimates.
• SS-PSE: successive sampling population size estimates, e.g. respondent-driven or snowball sampling.
Within a single survey, multiple size estimation methods were commonly conducted and combined into a final consensus KPSE; where possible, separate estimates for each method were extracted. For cases where only a final estimate of multiple methods was reported, two further categories were defined: "Multiple methods - empirical" or "Multiple methods - mixture". The former contained estimates derived from multiple of the five methods above, while the latter were derived from both empirical and non-empirical methods (enumeration, wisdom of the crowds, key informant interviews, and the Delphi method). KPSEs derived by solely non-empirical methods were excluded from analyses.
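The classification rule above can be sketched in a few lines of Python. This is an illustration only: the function and set names are hypothetical, and the set of five empirical labels assumes the methods named elsewhere in this supplement (2S-CRC, 3S-CRC, NSUM, PLACE/mapping, SS-PSE).

```python
# Illustrative sketch only; names are hypothetical, not the authors' code.
# Assumed empirical labels, drawn from the methods named in this supplement.
EMPIRICAL = {"2S-CRC", "3S-CRC", "NSUM", "PLACE/mapping", "SS-PSE"}
NON_EMPIRICAL = {
    "enumeration",
    "wisdom of the crowds",
    "key informant interviews",
    "Delphi",
}


def classify_kpse(methods):
    """Return the category for an estimate, or None if it would be excluded."""
    methods = set(methods)
    unknown = methods - EMPIRICAL - NON_EMPIRICAL
    if not methods or unknown:
        raise ValueError(f"unrecognised or missing methods: {sorted(unknown)}")
    empirical = methods & EMPIRICAL
    if not empirical:
        return None  # solely non-empirical estimates are excluded
    if methods - EMPIRICAL:
        return "Multiple methods - mixture"  # empirical plus non-empirical
    if len(empirical) == 1:
        return next(iter(empirical))  # a single method keeps its own label
    return "Multiple methods - empirical"


print(classify_kpse(["2S-CRC", "SS-PSE"]))
print(classify_kpse(["NSUM", "Delphi"]))
```

Returning `None` for solely non-empirical estimates keeps the exclusion explicit, so a caller can filter those records out before modelling.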

Supplementary Text S3: Viral load suppression observations
The definition of viral load suppression differed between studies. Most studies (n=25) used a threshold of ≤1000 copies/ml, one study each used thresholds of ≤500, ≤400, ≤398, and ≤200 copies/ml, and 5 studies used ≤50 copies/ml. Two studies (Nigeria 2020 BBS 1,2 and Democratic Republic of Congo 2019 BBS) did not report a viral load suppression threshold and were assumed to have used 1000 copies/ml. To standardise viral load suppression observations at 1000 copies/ml, we used the adjustment from Johnson et al. (2021) 3, using the Weibull distribution with a shape parameter of 0.85.
Second, we converted viral load suppression estimates into estimates of ART coverage to facilitate comparison with total population ART coverage estimates. Five South African studies published both ART metabolite biomarker-confirmed ART usage and viral load suppression estimates. We meta-analysed these studies and estimated the average logit(ART) − logit(VLS) = −0.32. To convert viral load suppression observations into ART coverage, we used −0.32 as a regression offset.

1. Nigeria Ministry of Health.
The rural-to-urban ratio was assumed to follow a Beta(5, 3) distribution, which has a mean ratio of about 0.6 and about 80% of its mass between 0.4 and 0.8, allowing wide uncertainty to be propagated into the resulting estimates.
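As a quick check of the stated prior, the sketch below reads the (5,3) distribution as a Beta distribution, which is an assumption that is consistent with the stated mean. It uses only the Python standard library, relying on the binomial-sum identity for the Beta CDF that holds when both shape parameters are positive integers.

```python
from math import comb

A, B = 5, 3  # assumed Beta shape parameters


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, valid for positive integer a and b."""
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


mean = A / (A + B)
mass_04_08 = beta_cdf_int(0.8, A, B) - beta_cdf_int(0.4, A, B)

print(f"mean = {mean:.3f}")                        # mean = 0.625
print(f"P(0.4 < ratio < 0.8) = {mass_04_08:.3f}")  # P(0.4 < ratio < 0.8) = 0.756
```

Under this reading the mean is 0.625 and about 76% of the mass lies between 0.4 and 0.8, in line with the rounded figures quoted in the text.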

Key population HIV prevalence
We modelled the relationship between key population HIV prevalence and sex-matched total population HIV prevalence (age 15-49 years) in the same admin-1 region separately for FSW and PWID, and together for MSM and TGW.
The number of key population members living with HIV, $y_{a,p,t,c,r}$, is assumed to follow a beta-binomial distribution with expected HIV prevalence $\rho_{a,p,t,c,r}$ for a given age group $a$, province $p$, year $t$, country $c$, and region $r \in \{\text{ESA}, \text{WCA}\}$. Logit-transformed prevalence, $\text{logit}(\rho_{a,p,t,c,r})$, is expressed as a linear model with an intercept $\beta_0$; fixed effects for matched total population HIV prevalence ($x_{a,p,t,c,r}$), region $r \in \{\text{ESA}, \text{WCA}\}$, and an interaction between matched total population HIV prevalence and region; intrinsic conditional autoregressive (ICAR) spatial smoothing random effects at the provincial and national levels, $u_p$ for $p \in 1, 2, \ldots, P$ and $u_c$ for $c \in 1, 2, \ldots, 39$, respectively; and a study iid random effect, $u_s$ for $s \in 1, 2, \ldots, S$.
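Written out, the model described above takes roughly the following form. The symbol names are illustrative assumptions: $y$ is the number living with HIV, $n$ the number tested, $\rho$ the expected prevalence, $x$ the matched total population HIV prevalence covariate (its transformation is not stated in this passage), and $\phi$ a beta-binomial dispersion parameter. Taking ESA as the reference region is also an assumption.

```latex
\begin{aligned}
y_{a,p,t,c,r} &\sim \text{BetaBinomial}\left(n_{a,p,t,c,r},\ \rho_{a,p,t,c,r},\ \phi\right) \\
\text{logit}\left(\rho_{a,p,t,c,r}\right) &= \beta_0
  + \beta_1\, x_{a,p,t,c,r}
  + \beta_2\, \mathbb{1}[r = \text{WCA}]
  + \beta_3\, x_{a,p,t,c,r}\, \mathbb{1}[r = \text{WCA}]
  + u_p + u_c + u_s \\
u_p,\ u_c &\sim \text{ICAR}, \qquad u_s \sim \text{iid Normal}
\end{aligned}
```

Here $u_p$ and $u_c$ are the provincial and national spatial effects and $u_s$ is the study-level effect.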

Key population ART coverage
We modelled key population ART coverage as a function of total population ART coverage, analogously to HIV prevalence. All key populations were modelled together. In sensitivity analysis, we added a fixed effect for study method, $m$, where $m = 0$ for laboratory-confirmed ART and $m = 1$ for self-reported ART usage.
National gender-matched ART coverage estimates for 2022 from Spectrum files were used to estimate national key population ART coverage (Supplementary Table S11). Spectrum estimates gender-specific ART coverage by dividing gender-specific ART programme counts by the gender-specific estimate of the number of people living with HIV.
Misspecification of the sex ratio of new HIV infections can lead to an under-enumeration of women living with HIV and an over-enumeration of men living with HIV. This leads to an overestimate of ART coverage among women and an underestimate of ART coverage among men. We assessed this to have occurred in Benin, Burkina Faso, Malawi, Senegal, and Sierra Leone, which had implausibly large differences between female and male ART coverage, defined as logit(ART coverage among women) − logit(ART coverage among men) > 2. We adjusted the logit difference of gender-specific ART coverages in these countries to be 0.72, the median logit difference across the remaining 34 countries in SSA.
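The plausibility check described above can be illustrated with a minimal sketch. The function names and coverage values are hypothetical, and the code covers only the flagging step, because the passage does not state how the adjusted difference of 0.72 is apportioned between female and male coverage.

```python
from math import log


def logit(p):
    """Log odds of a proportion strictly between 0 and 1."""
    return log(p / (1 - p))


def logit_gap(female_cov, male_cov):
    """Difference in logit ART coverage, women minus men."""
    return logit(female_cov) - logit(male_cov)


def is_implausible(female_cov, male_cov, threshold=2.0):
    """Flag a country whose female-male logit gap exceeds the threshold."""
    return logit_gap(female_cov, male_cov) > threshold


# Hypothetical coverage values, not taken from Spectrum files.
print(round(logit_gap(0.95, 0.60), 2), is_implausible(0.95, 0.60))
print(round(logit_gap(0.85, 0.75), 2), is_implausible(0.85, 0.75))
```

For scale, a logit difference of 0.72 corresponds to an odds ratio of about 2 between female and male coverage.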
Supplementary Figure S3: Age group sensitivity analysis for FSW and MSM population proportion. Key population surveys may recruit individuals younger than the 15-49 year old denominator assumed in the primary analysis (yellow; main text Figure 3B). This sensitivity analysis estimates urban PSE proportions for MSM using all men aged 15-29 as the matched total population denominator (blue), and for FSW using all women aged 15-39. This increases the PSE proportions as the denominator has decreased. The dotted line on the MSM plot represents the UNAIDS/WHO recommended minimum population size proportion of 1% of total population men. MSM: men who have sex with men

Supplementary Table S1
One of:
• Two source capture recapture (2S-CRC)
• Three source capture recapture (3S-CRC)
When several methods were used to create a median or consensus estimate, the individual method estimates were recorded, and the median estimate was not. In cases where only the median was reported, two further categories are defined:
• Multiple methods - empirical: All methods used to create the median estimate were from the eight methods listed above.
• Multiple methods - mixture: Methods used to create the median estimate were a mixture of one or more of the eight methods above, plus a non-empirical method (e.g. wisdom of the crowds, enumeration, literature review).
Methods employed to assess HIV prevalence.
One of:
• Laboratory confirmed: Serologically confirmed HIV status through point-of-care rapid test or laboratory confirmation.
• Self-report: Self-reported HIV status.
Methods employed to assess ART coverage.
One of:
• Laboratory confirmed: Presence of ART metabolites confirmed through laboratory testing.
• VLS: For studies reporting the proportion of the population that was virally suppressed, rather than on treatment, this proportion was divided by 0.9 to approximate ART coverage.

Log odds ratios for population size estimates by methods. Two fixed effect categories were estimated: Empirical and PLACE/Mapping. Random effects were estimated for empirical methods (two and three source CRC (2S- and 3S-CRC), network scale-up (NSUM), successive sampling population size estimation (SS-PSE), average estimates from multiple empirical methods, and estimates derived from a mixture of empirical and non-empirical methods (Multiple methods - mixed)). See Supplementary Figure S4 for a graphical representation of population size estimate method effects.

Regional Average: 4.0 (2.7, 5.7); 1.4 (0.9, 2.0); 0.5 (0.3, 0.8); 0.2 (0.1, 0.5)
FSW: female sex workers; MSM: men who have sex with men; PWID: people who inject drugs; TGW: transgender women; PLHIV: people living with HIV; ESA: Eastern and Southern Africa; WCA: Western and Central Africa; SSA: sub-Saharan Africa

Supplementary Figure S4: Population size method effects. Log odds ratios for population size estimates by methods. Two fixed effect categories were estimated: Empirical and PLACE/Mapping. Random effects were estimated for empirical methods (two and three source CRC (2S- and 3S-CRC), network scale-up (NSUM), successive sampling population size estimation (SS-PSE), average estimates from multiple empirical methods, and estimates derived from a mixture of empirical and non-empirical methods (Multiple methods - mixed)). See Supplementary Table S4 for a tabular representation of population size estimate method effects. Each key population was estimated in separate regression models, but results are presented together to enable comparison of estimates. PLACE: Priorities for Local AIDS Control Efforts; FSW: female sex workers; MSM: men who have sex with men; PWID: people who inject drugs; TGW: transgender women

Supplementary Figure S5: Sensitivity analysis for imputed tested denominators of HIV prevalence observations. The main results present estimates with observations missing tested denominators imputed using the 25th centile of key population-matched known denominators. This sensitivity analysis shows two further imputations, using the median (50th centile) of known denominators, and the 75th centile.
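The denominator imputation described for Supplementary Figure S5 could be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (`kp`, `tested`), the function name, and the example values are hypothetical, and the centile definition (the "inclusive" method of Python's `statistics.quantiles`) is a choice made here, since the text does not specify one.

```python
from statistics import quantiles


def impute_tested(observations, centile=25):
    """Fill missing 'tested' denominators with a centile (25, 50, or 75) of
    the known denominators from the same key population."""
    index = {25: 0, 50: 1, 75: 2}[centile]
    known = {}
    for obs in observations:
        if obs["tested"] is not None:
            known.setdefault(obs["kp"], []).append(obs["tested"])
    # Quartile cut points per key population; each group needs at least two
    # known denominators, and a group with none raises KeyError below.
    cuts = {
        kp: quantiles(vals, n=4, method="inclusive") for kp, vals in known.items()
    }
    imputed = []
    for obs in observations:
        tested = obs["tested"]
        if tested is None:
            tested = round(cuts[obs["kp"]][index])
        imputed.append({**obs, "tested": tested})  # copy, leave input untouched
    return imputed


# Hypothetical observations, not taken from the study data.
surveys = [
    {"kp": "FSW", "tested": 100},
    {"kp": "FSW", "tested": 200},
    {"kp": "FSW", "tested": 300},
    {"kp": "FSW", "tested": 400},
    {"kp": "FSW", "tested": 500},
    {"kp": "FSW", "tested": None},
]

for c in (25, 50, 75):
    print(c, impute_tested(surveys, centile=c)[-1]["tested"])
```

Running the three centiles side by side mirrors the sensitivity analysis: a smaller imputed denominator gives the affected observations less weight in a binomial-type likelihood.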
Integrated Biological & Behavioural Surveillance Survey (IBBSS) among Key Populations in Nigeria; 2020.

Table S4: Population size estimate method regression results
• Self-report: Self-reported ART status
Area: Surveillance area (e.g. city, district)

Table S5: HIV prevalence model regression results among female sex workers.
PLACE: Priorities for Local AIDS Control Efforts; MM: Multiple methods; FSW: female sex workers; MSM: men who have sex with men; PWID: people who inject drugs; TGW: transgender women